  1. Apache Storm
  2. STORM-3093

Cache the storm id to executors mapping on master to avoid repeated computation

Details

    Description

      Currently, Nimbus collects the conf/topology-ser/storm-base of every topology in each scheduling round, which is very heavy work. Scheduling still takes minutes, even now that we have switched to RPC heartbeats and assignment distribution.

      So I decided to redesign the scheduler so that we schedule only the topologies that need it: those that have dead workers or too few workers.
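
      As a rough illustration of that selection rule, here is a minimal sketch. The names WorkerStatus and DeltaSelector are hypothetical and are not Storm classes; the sketch assumes Nimbus can already tell, per topology, how many workers were requested, how many are assigned, and how many are dead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical per-topology worker status; not part of Storm's API. */
final class WorkerStatus {
    final int requestedWorkers; // workers the topology asked for
    final int assignedWorkers;  // workers currently assigned to it
    final int deadWorkers;      // assigned workers considered dead

    WorkerStatus(int requestedWorkers, int assignedWorkers, int deadWorkers) {
        this.requestedWorkers = requestedWorkers;
        this.assignedWorkers = assignedWorkers;
        this.deadWorkers = deadWorkers;
    }

    /** A topology needs scheduling if it has dead workers or too few workers. */
    boolean needsScheduling() {
        return deadWorkers > 0 || assignedWorkers < requestedWorkers;
    }
}

/** Hypothetical helper that picks the topologies a scheduling round must handle. */
final class DeltaSelector {
    static List<String> topologiesToSchedule(Map<String, WorkerStatus> statusById) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, WorkerStatus> entry : statusById.entrySet()) {
            if (entry.getValue().needsScheduling()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

      Healthy topologies are skipped entirely, so the cost of a round depends on how many topologies changed rather than on how many exist.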

      While going through the code, I found that the id->executors mapping is recomputed for every topology in every scheduling round. This is a heavy and largely unnecessary computation, because the mapping is fixed for a topology unless we rebalance or kill it.

      So I refactored the code a little here. This will be even more useful once the scheduler is redesigned for delta-scheduling [which is very lightweight even when there are thousands of topologies on one cluster].
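
      The caching idea can be sketched as follows. This is not the actual patch: ExecutorsCache, executorsFor, and invalidate are hypothetical names, and the expensive computation is passed in as a function so the sketch stays independent of Nimbus internals. It assumes the mapping changes only on rebalance or kill, as described above.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical cache of the storm id -> executors mapping.
 * E stands in for whatever type represents an executor.
 */
final class ExecutorsCache<E> {
    private final ConcurrentMap<String, List<E>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<E>> compute; // the expensive computation

    ExecutorsCache(Function<String, List<E>> compute) {
        this.compute = compute;
    }

    /** Returns the cached executors, computing them only on the first request. */
    List<E> executorsFor(String stormId) {
        return cache.computeIfAbsent(stormId, compute);
    }

    /** Call on rebalance or kill, the only events assumed to change the mapping. */
    void invalidate(String stormId) {
        cache.remove(stormId);
    }
}
```

      With such a cache, repeated scheduling rounds reuse the stored mapping, and only a rebalance or kill forces a recomputation for the affected topology.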

      For now this is enough for us.

            People

              danny0405 Danny Chen
              Votes:
              0
              Watchers:
              2

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m